Uyghur Language Model with Graphic Structure

Authors

  • Miliwan Xuehelaiti
  • Kai Liu
  • Wenbin Jiang
  • Tuergen Yibulayin
Abstract

This paper describes a novel language modeling strategy for Uyghur, an agglutinative language, built on a graph-structured ("graphic") language model. In the graphic language model, the morphemes of a sentence are organized as a directed graph, in contrast to the linear structure of n-gram language models. The model is evaluated in two typical natural language processing scenarios, morphological analysis and machine translation. The experiments show that it achieves significant improvements in both tasks.
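
The contrast between a linear and a graph-organized morpheme context can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not specify which morphemes are connected, so the edge scheme here (each suffix depends on the preceding morpheme of its own word, and each stem depends on the previous word's stem) is hypothetical, as are the function names and the example segmentation.

```python
# Illustrative only: the edge scheme below is an assumption, not taken from the paper.

def linear_edges(words):
    """n-gram-style context: each morpheme follows the previous one in surface order."""
    morphemes = [m for word in words for m in word]
    return list(zip(morphemes, morphemes[1:]))


def graph_edges(words):
    """Hypothetical directed graph: suffixes chain inside a word,
    and each stem links directly to the next word's stem."""
    edges = []
    for i, word in enumerate(words):
        # stem -> suffix -> suffix within one word
        edges.extend(zip(word, word[1:]))
        # stem -> next stem, skipping over the suffixes in between
        if i + 1 < len(words):
            edges.append((word[0], words[i + 1][0]))
    return edges


# Illustrative segmentation (stem first, then suffixes); not data from the paper.
sentence = [["mektep", "+ler", "+de"], ["oqu", "+di"]]

print(linear_edges(sentence))
print(graph_edges(sentence))
```

In the linear view the second stem is conditioned on the last suffix of the previous word, while in this sketch of a graph view it is conditioned on the previous stem, which is one way a directed graph over morphemes can differ from a flat sequence.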

Similar resources

Noisy Uyghur Text Normalization

Uyghur is the second largest and most actively used social media language in China. However, a non-negligible part of Uyghur text appearing in social media is unsystematically written with the Latin alphabet, and it continues to increase in size. Uyghur text in this format is incomprehensible and ambiguous even to native Uyghur speakers. In addition, Uyghur texts in this form lack the potential...

Using Discourse Structure-Based Graphic Organizers (GOs) in Developing EFL Learners’ Reading Comprehension

This study investigated the effectiveness of discourse structure-based graphic organizers (GOs) on students’ reading comprehension. Seventy learners were randomly divided in two equal experimental and control groups. They took a pre-test of reading comprehension. Then the experimental group analyzed the textual structures based on the frameworks of GOs and discussed the cohesive and coherent de...

The development of Tagged Uyghur Corpus

The history and development of Uyghur language is introduced. After a brief introduction to the development of Uyghur words, morphology and syntax, we explain our developing of a computer-aided contemporary Uyghur language tagging system. The coverage of this corpus, the resources building, the rules for syncopating and tagging etyma and termination, and the tagging of a corpus using a small ta...

Bidirectional Long Short-Term Memory Network with a Conditional Random Field Layer for Uyghur Part-Of-Speech Tagging

Uyghur is an agglutinative and a morphologically rich language; natural language processing tasks in Uyghur can be a challenge. Word morphology is important in Uyghur part-of-speech (POS) tagging. However, POS tagging performance suffers from error propagation of morphological analyzers. To address this problem, we propose a few models for POS tagging: conditional random fields (CRF), long shor...

Comparison of functional magnetic resonance imaging in cerebral activation between normal Uygur and Mandarin participants in semantic identification task.

PURPOSE This study utilized blood oxygenation level-dependent functional magnetic resonance imaging (BOLD-fMRI) technology to study the activated cerebral regions in normal participants whose native language was Uyghur or Chinese. METHODS We collected the fMRI data from 15 Uyghur-speaking volunteers and 15 Mandarin-speaking volunteers when executing the semantic identification task and compar...

Journal:
  • Journal of Multimedia

Volume 9  Issue 

Pages  -

Publication date: 2014